Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filter files early in bytestream uploader to avoid redundant work and double handling #23252

Conversation

Silic0nS0ldier
Copy link
Contributor

@Silic0nS0ldier Silic0nS0ldier commented Aug 9, 2024

Re-implementation of #20575 and a test case for minimal mode.

In this PR;

  • Paths affected by --remote_build_event_upload=minimal (default) are treated similar to those affected by the tag no-remote-cache (they are omitted).
    The actual paths are still resolvable by the returned resolver (bytestream:// URIs that do not reflect an uploaded asset are intended, as per discussion in Move BES upload check earlier, to prevent calling FindMissingBlobs #16999)
  • Latter paths skip file kind checks, since the information is only needed to upload and has overhead.
  • Made calling isDirectory and isSymlink on an omitted file an error (they would otherwise return false, which is technically incorrect).

This should improve performance. Benchmarking (in Bazel v6) has shown 54% of a Bazel profile for a fully cached build is spent in readPathMetadata.

The test case in this PR appears to be the only such test for minimal mode (at least in Java code).

Fixes #20576

TODO

  • Run new test case against current implementation to ensure no behaviour differences (and if there are, that the new behaviour is intended).
  • Give PathConverterImpl the full list of files, while only a subset is uploaded under minimal they paths are still needed for protos.
  • Apply optimisations to all omitted files (e.g. no-remote-cache tagged)

@Silic0nS0ldier
Copy link
Contributor Author

Superseded by #23374

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

perf: bazel computes digests for too many files when remote_build_event_upload is set to minimal
1 participant